Language Modeling for Verbatim Translation Task

Authors

  • Maxim Khalilov
  • José A.R. Fonollosa
Abstract

In this paper we present first results towards finding a better configuration of the TC-STAR 2006 verbatim transcription system by improving language model performance. Little research has been devoted to techniques specific to verbatim translation; we therefore attempt to improve translation accuracy by combining the Final Text Edition (FTE) system with a supplementary verbatim corpus. Our work focuses on finding the best combination of the baseline (FTE) and verbatim language models for the Spanish-English and English-Spanish language pairs. To improve overall system performance, the standard n-gram-based statistical machine translation (SMT) system was supplemented with a log-linear combination of additional feature functions and a linguistically motivated word reordering technique. In the final part of the study we compare the translation accuracy of the baseline system with that of systems using FTE-verbatim interpolated language models, for various interpolation weights in the linear combination of the language models.
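The linear combination of the FTE and verbatim language models mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard form of linear interpolation, P(w|h) = λ·P_FTE(w|h) + (1 − λ)·P_verbatim(w|h), and scores each weight by perplexity; the abstract gives neither the formula nor the selection criterion, so both are assumptions. The function names and the probability values are hypothetical and do not come from the paper.

```python
import math


def interpolate(p_fte, p_verbatim, lam):
    """Linearly interpolate two conditional word probabilities.

    lam is the weight given to the FTE (baseline) model; the verbatim
    model receives the remaining 1 - lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_fte + (1.0 - lam) * p_verbatim


def perplexity(probs):
    """Perplexity of a word sequence given its per-word probabilities."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def sweep(fte_probs, verbatim_probs, weights):
    """Return {lam: perplexity} for each candidate interpolation weight."""
    results = {}
    for lam in weights:
        mixed = [
            interpolate(a, b, lam)
            for a, b in zip(fte_probs, verbatim_probs)
        ]
        results[lam] = perplexity(mixed)
    return results


# Hypothetical per-word probabilities assigned by each model to the same
# held-out sentence (illustrative values only, not results from the paper).
fte_probs = [0.20, 0.05, 0.10]
verbatim_probs = [0.10, 0.15, 0.10]

if __name__ == "__main__":
    for lam, ppl in sweep(fte_probs, verbatim_probs, [0.0, 0.25, 0.5, 0.75, 1.0]).items():
        print(f"lambda={lam:.2f}  perplexity={ppl:.3f}")
```

In practice the per-word probabilities would come from trained n-gram models, and the weight would be chosen on held-out data, either by perplexity as here or directly by translation accuracy.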


Similar resources

A WFST-based log-linear framework for speaking-style transformation

  • Objective: Transform spoken-style language (V) into written-style language (W) for the creation of transcripts
  • Approach: Statistical machine translation to "translate" from verbatim text to written text
  • Innovations:
      • Log-linear modeling for improved accuracy
      • Introduction of features to handle common phenomena in speaking-style transformation
      • WFST-based implementation for integration with W...


The Effect of Genre Awareness on English Translation Quality and Pedagogy: A Case of News Reports Translation as an Academic Curriculum

To produce an adequate translation, language students are required to learn a variety of language features, including syntax, semantics and pragmatics. Considering the curriculum language learners are faced with, one can claim that almost all language students in Iran are taught these features in their academic settings, including linguistics courses. Yet, there are some aspects of language which a...


Scaffolding for English as Foreign Language Writers: Writing a Scholastic Essay

This article describes how a group of Iranian upper-intermediate EFL learners were guided through the practice of writing their first academic essays in English. The method applied the principle of scaffolding to the essay writing process by providing flexible support for the learners during the writing of their first essays. Scaffolding included a number of aspects, each of which is explained i...


Word Confidence Estimation for Speech Translation

Word Confidence Estimation (WCE) for machine translation (MT) or automatic speech recognition (ASR) consists in judging each word in the (MT or ASR) hypothesis as correct or incorrect by tagging it with an appropriate label. In the past, this task has been treated separately in ASR or MT contexts and we propose here a joint estimation of word confidence for a spoken language translation (SLT) t...


Generic Analysis of Literary Translation: A Case Study of Contemporary English Short Stories

Translation of a literary text is a difficult task, for understanding literature requires knowledge of the various linguistic levels of a literary text in addition to strategies and methods of translation. To this should be added cognitive-based translation training, which helps practitioners preserve the aesthetic aspects of a literary text. Focusing on the short story as a genre with both ...




Publication date: 2006